This journal documents our work to identify the data used to train rtrees (contained in both bulletr and bulletxtrctr). We recently discovered discrepancies between rtrees and the data we had been treating as its training data (hamby-comparisons.csv, previously features-hamby173and252.csv). Unfortunately, Eric's original work is not well documented, so it is difficult to know for sure which dataset was used to train the model. However, Eric recently pointed us to some code that he believes was used to train rtrees. The first part of that code (scans to features) created the CSAFE database, and Heike was able to rerun part of the code to extract data from that database. The new data (CCFs_withlands.csv) has the same number of rows as the number of predictions associated with rtrees. This journal gathers all of our work on determining whether CCFs_withlands is the data used to train rtrees.

library(cowplot)
library(dplyr)
library(ggplot2)
library(purrr)
library(randomForest)
library(stringr)
library(tidyr)

Raw Datasets

This section contains code for loading and performing initial comparisons of the raw datasets.

Load Data

Load the data provided by Heike and CSAFE, which we have been using and believed to be the training data for rtrees:

hamby_comparisons_raw <- read.csv("../../../data/raw/hamby-comparisons.csv")

Load the data extracted from the CSAFE database on 2020-09-23, which corresponds to the R script Eric believes was used to train rtrees:

CCFs_withlands_raw <- read.csv("../../../data/raw/CCFs_withlands.csv")

Obtain the features used to train rtrees for use throughout this journal:

rtrees_features <- rownames(bulletxtrctr::rtrees$importance)

Initial Comparisons

Dimensions of the raw datasets:

dim(hamby_comparisons_raw)
## [1] 84666    17
dim(CCFs_withlands_raw)
## [1] 83028    26

Note that CCFs_withlands has the same number of rows as the number of predictions associated with rtrees:

length(bulletxtrctr::rtrees$predicted)
## [1] 83028

Compare summaries of the distributions of the features used to train rtrees in the two datasets. The main differences are in the distributions of D (distance) and sd_D (standard deviation of distance):

summary(hamby_comparisons_raw %>% select(all_of(rtrees_features)))
##       ccf            rough_cor              D                 sd_D      
##  Min.   :0.01407   Min.   :-0.86978   Min.   : 0.05055   Min.   :0.780  
##  1st Qu.:0.21288   1st Qu.:-0.18680   1st Qu.: 2.02500   1st Qu.:2.349  
##  Median :0.27538   Median :-0.05761   Median : 2.77395   Median :2.728  
##  Mean   :0.28940   Mean   :-0.07392   Mean   : 3.31462   Mean   :2.815  
##  3rd Qu.:0.34831   3rd Qu.: 0.04942   3rd Qu.: 4.10920   3rd Qu.:3.217  
##  Max.   :0.98110   Max.   : 0.96355   Max.   :20.79681   Max.   :7.263  
##     matches         mismatches          cms             non_cms      
##  Min.   : 0.000   Min.   : 0.000   Min.   : 0.0000   Min.   : 0.000  
##  1st Qu.: 1.206   1st Qu.: 8.027   1st Qu.: 0.5915   1st Qu.: 3.678  
##  Median : 2.228   Median : 9.494   Median : 1.1378   Median : 4.855  
##  Mean   : 2.444   Mean   : 9.756   Mean   : 1.3562   Mean   : 5.351  
##  3rd Qu.: 3.279   3rd Qu.:11.228   3rd Qu.: 1.7112   3rd Qu.: 6.531  
##  Max.   :18.656   Max.   :33.684   Max.   :15.2131   Max.   :33.684  
##    sum_peaks     
##  Min.   : 0.000  
##  1st Qu.: 1.634  
##  Median : 2.762  
##  Mean   : 3.027  
##  3rd Qu.: 4.102  
##  Max.   :25.990
summary(CCFs_withlands_raw %>% select(all_of(rtrees_features)))
##       ccf            rough_cor              D                  sd_D         
##  Min.   :-0.0127   Min.   :-0.89155   Min.   :0.0003513   Min.   :0.001250  
##  1st Qu.: 0.2124   1st Qu.:-0.20245   1st Qu.:0.0022353   1st Qu.:0.003698  
##  Median : 0.2750   Median :-0.06423   Median :0.0026339   Median :0.004307  
##  Mean   : 0.2890   Mean   :-0.08400   Mean   :0.0027578   Mean   :0.004430  
##  3rd Qu.: 0.3481   3rd Qu.: 0.04686   3rd Qu.:0.0032182   3rd Qu.:0.005088  
##  Max.   : 0.9811   Max.   : 0.96355   Max.   :0.0072105   Max.   :0.011110  
##     matches         mismatches          cms             non_cms      
##  Min.   : 0.000   Min.   : 0.000   Min.   : 0.0000   Min.   : 0.000  
##  1st Qu.: 1.182   1st Qu.: 7.170   1st Qu.: 0.5893   1st Qu.: 3.218  
##  Median : 2.197   Median : 8.225   Median : 1.1348   Median : 4.192  
##  Mean   : 2.403   Mean   : 8.244   Mean   : 1.3432   Mean   : 4.564  
##  3rd Qu.: 3.249   3rd Qu.: 9.289   3rd Qu.: 1.7082   3rd Qu.: 5.555  
##  Max.   :18.656   Max.   :14.947   Max.   :15.2131   Max.   :14.161  
##    sum_peaks     
##  Min.   : 0.000  
##  1st Qu.: 1.566  
##  Median : 2.709  
##  Mean   : 2.982  
##  3rd Qu.: 4.062  
##  Max.   :23.775

The same holds when the observations in hamby_comparisons known to have tank rash (flag != FALSE) are removed, so that only rows with flag == FALSE remain. This seems to suggest that the units of D and sd_D differ between hamby_comparisons and CCFs_withlands:

summary(hamby_comparisons_raw %>% filter(flag == FALSE) %>% select(all_of(rtrees_features)))
##       ccf            rough_cor              D                 sd_D      
##  Min.   :0.01407   Min.   :-0.86978   Min.   : 0.05055   Min.   :0.780  
##  1st Qu.:0.21297   1st Qu.:-0.18430   1st Qu.: 2.02162   1st Qu.:2.347  
##  Median :0.27541   Median :-0.05671   Median : 2.76574   Median :2.724  
##  Mean   :0.28940   Mean   :-0.07166   Mean   : 3.29264   Mean   :2.808  
##  3rd Qu.:0.34822   3rd Qu.: 0.04999   3rd Qu.: 4.07658   3rd Qu.:3.208  
##  Max.   :0.98110   Max.   : 0.96355   Max.   :20.79681   Max.   :7.263  
##     matches         mismatches          cms             non_cms      
##  Min.   : 0.000   Min.   : 0.000   Min.   : 0.0000   Min.   : 0.000  
##  1st Qu.: 1.209   1st Qu.: 8.027   1st Qu.: 0.5915   1st Qu.: 3.675  
##  Median : 2.230   Median : 9.494   Median : 1.1388   Median : 4.848  
##  Mean   : 2.448   Mean   : 9.756   Mean   : 1.3578   Mean   : 5.345  
##  3rd Qu.: 3.282   3rd Qu.:11.228   3rd Qu.: 1.7112   3rd Qu.: 6.525  
##  Max.   :18.656   Max.   :33.684   Max.   :15.2131   Max.   :33.684  
##    sum_peaks     
##  Min.   : 0.000  
##  1st Qu.: 1.638  
##  Median : 2.764  
##  Mean   : 3.030  
##  3rd Qu.: 4.105  
##  Max.   :25.990
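
One rough way to probe the unit-change idea is to divide matching summary statistics from the two datasets. The sketch below hard-codes the D and sd_D statistics printed above for hamby_comparisons_raw and CCFs_withlands_raw; the object names (stat_names, hamby_stats, ccf_stats, stat_ratio) are hypothetical and not part of the original analysis. A ratio that is roughly constant across the statistics would be consistent with a pure change of units, while a ratio that varies widely would suggest that something besides units differs.

```r
# Summary statistics copied from the summary() output above.
# All object names here are hypothetical helpers for this sketch.
stat_names <- c("min", "q1", "median", "mean", "q3", "max")

hamby_stats <- data.frame(
  D    = c(0.05055, 2.02500, 2.77395, 3.31462, 4.10920, 20.79681),
  sd_D = c(0.780, 2.349, 2.728, 2.815, 3.217, 7.263),
  row.names = stat_names
)

ccf_stats <- data.frame(
  D    = c(0.0003513, 0.0022353, 0.0026339, 0.0027578, 0.0032182, 0.0072105),
  sd_D = c(0.001250, 0.003698, 0.004307, 0.004430, 0.005088, 0.011110),
  row.names = stat_names
)

# Element-wise ratio: hamby_comparisons statistic / CCFs_withlands statistic
stat_ratio <- hamby_stats / ccf_stats
round(stat_ratio)
```

Because the two datasets do not contain exactly the same rows, the ratios can only hint at a scale factor; they are not a proof either way.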

Visualize the distributions of the features used to train rtrees in the two datasets to see again the differences in the distributions of D and sd_D:

bind_rows(
  hamby_comparisons_raw %>%
    filter(flag == FALSE) %>%
    select(all_of(rtrees_features)) %>%
    pivot_longer(cols = everything()) %>%
    mutate(dataset = "hamby_comparisons"),
  CCFs_withlands_raw %>% select(all_of(rtrees_features)) %>%
    pivot_longer(cols = everything()) %>%
    mutate(dataset = "CCFs_withlands")
) %>%
  ggplot(aes(x = value, fill = dataset)) + 
  geom_histogram() +
  facet_grid(dataset ~ name)

Data Cleaning

There are discrepancies in the naming conventions of the two datasets. This section contains the code that identifies the differences and cleans the data.

Extracting Variables of Interest

The following variables are contained in the raw versions of the data. The two datasets contain different variables and use different names for the variables they share. For our analysis, we only need the variables identifying the lands, the features used to train rtrees, the ground truth for whether or not the bullets are a match, and the variable flagging bullets with tank rash.

names(hamby_comparisons_raw)
##  [1] "land_id1"         "land_id2"         "ccf"              "rough_cor"       
##  [5] "lag"              "abs_lag"          "D"                "sd_D"            
##  [9] "matches"          "mismatches"       "cms"              "overlap"         
## [13] "non_cms"          "sum_peaks"        "signature_length" "same_source"     
## [17] "flag"
names(CCFs_withlands_raw)
##  [1] "compare_id"       "profile1_id"      "profile2_id"      "ccf"             
##  [5] "rough_cor"        "lag"              "D"                "sd_D"            
##  [9] "signature_length" "overlap"          "matches"          "mismatches"      
## [13] "cms"              "non_cms"          "sum_peaks"        "land_id.x"       
## [17] "land_id.y"        "match"            "study.x"          "barrel.x"        
## [21] "bullet.x"         "land.x"           "study.y"          "barrel.y"        
## [25] "bullet.y"         "land.y"

The code below separates the land id variables in hamby_comparisons_raw into separate columns for study, barrel, bullet, and land and selects only the variables of interest.

hamby_comparisons_select <- 
  hamby_comparisons_raw %>%
  separate(land_id1, c("study1", "barrel1", "bullet1", "land1")) %>%
  separate(land_id2, c("study2", "barrel2", "bullet2", "land2")) %>%
  select(
    study1,
    barrel1,
    bullet1,
    land1,
    study2,
    barrel2,
    bullet2,
    land2,
    all_of(rtrees_features),
    same_source,
    flag
  )
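
To illustrate what the separate() calls above do, here is a minimal base R sketch that splits two example land IDs into their four parts. It assumes the IDs are hyphen-delimited in the order study, barrel, bullet, land; the example IDs and the names land_ids and land_parts are illustrative only.

```r
# Example IDs in the assumed "study-barrel-bullet-land" format (illustrative)
land_ids <- c("Hamby173-Br10-B1-L1", "Hamby252-BrUnk-BQ-L4")

# Split each ID on the hyphen and bind the pieces into one row per ID
parts <- do.call(rbind, strsplit(land_ids, "-", fixed = TRUE))
land_parts <- as.data.frame(parts, stringsAsFactors = FALSE)
names(land_parts) <- c("study", "barrel", "bullet", "land")
land_parts
```

separate() splits on any run of non-alphanumeric characters by default, which is why the calls above work without an explicit sep argument.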

The code below renames some of the variables in CCFs_withlands_raw, selects only the variables of interest, and converts all land labels to characters.

CCFs_withlands_select <- 
  CCFs_withlands_raw %>%
  rename(
    "study1" = "study.x",
    "barrel1" = "barrel.x",
    "bullet1" = "bullet.x",
    "land1" = "land.x",
    "study2" = "study.y",
    "barrel2" = "barrel.y",
    "bullet2" = "bullet.y",
    "land2" = "land.y",
    "same_source"= "match"
  ) %>%
  select(
    study1,
    barrel1,
    bullet1,
    land1,
    study2,
    barrel2,
    bullet2,
    land2,
    all_of(rtrees_features),
    same_source
  ) %>%
  mutate(across(c(bullet1, land1, bullet2, land2), as.character))

Check to make sure the structures of the two datasets agree:

hamby_comparisons_select %>% str()
## 'data.frame':    84666 obs. of  19 variables:
##  $ study1     : chr  "Hamby173" "Hamby173" "Hamby173" "Hamby173" ...
##  $ barrel1    : chr  "Br10" "Br10" "Br10" "Br10" ...
##  $ bullet1    : chr  "B1" "B1" "B1" "B1" ...
##  $ land1      : chr  "L1" "L1" "L1" "L1" ...
##  $ study2     : chr  "Hamby173" "Hamby173" "Hamby173" "Hamby173" ...
##  $ barrel2    : chr  "Br10" "Br10" "Br10" "Br10" ...
##  $ bullet2    : chr  "B1" "B1" "B1" "B1" ...
##  $ land2      : chr  "L2" "L3" "L4" "L5" ...
##  $ ccf        : num  0.187 0.256 0.233 0.349 0.207 ...
##  $ rough_cor  : num  -0.0326 0.1787 0.0808 0.1682 0.0381 ...
##  $ D          : num  2.93 1.43 2.49 2.29 2.66 ...
##  $ sd_D       : num  2.64 1.88 2.57 2.68 2.57 ...
##  $ matches    : num  3.21 3.37 2.37 2.66 1.13 ...
##  $ mismatches : num  7.48 9.54 12.43 9.56 10.69 ...
##  $ cms        : num  1.6 1.68 2.37 1.59 1.13 ...
##  $ non_cms    : num  3.74 2.8 9.47 5.31 5.63 ...
##  $ sum_peaks  : num  4.21 3.63 2.3 3.92 0.45 ...
##  $ same_source: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ flag       : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
CCFs_withlands_select %>% str()
## 'data.frame':    83028 obs. of  18 variables:
##  $ study1     : chr  "Hamby252" "Hamby252" "Hamby252" "Hamby252" ...
##  $ barrel1    : chr  "10" "10" "10" "10" ...
##  $ bullet1    : chr  "1" "1" "1" "1" ...
##  $ land1      : chr  "1" "1" "1" "1" ...
##  $ study2     : chr  "Hamby252" "Hamby252" "Hamby252" "Hamby252" ...
##  $ barrel2    : chr  "10" "10" "10" "10" ...
##  $ bullet2    : chr  "1" "1" "1" "1" ...
##  $ land2      : chr  "2" "3" "4" "5" ...
##  $ ccf        : num  0.173 0.422 0.175 0.424 0.294 ...
##  $ rough_cor  : num  -0.58411 -0.41022 -0.16741 0.27833 -0.00905 ...
##  $ D          : num  0.00341 0.00332 0.00208 0.00187 0.00214 ...
##  $ sd_D       : num  0.00478 0.00479 0.00359 0.00329 0.00348 ...
##  $ matches    : num  1.09 1.22 2.19 4.09 5.41 ...
##  $ mismatches : num  6.62 9.15 5.42 7 6 ...
##  $ cms        : num  0.547 0.612 1.644 2.338 3.249 ...
##  $ non_cms    : num  4.07 8.13 4.43 2.8 1.5 ...
##  $ sum_peaks  : num  0.303 1.864 1.654 4.414 5.142 ...
##  $ same_source: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
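
The same check can be made programmatically rather than by reading the str() output. The sketch below uses two tiny made-up data frames (hc_toy and ccf_toy, both hypothetical) and compares the types of the shared columns; it also lists columns present in only one of them, such as flag.

```r
# Toy stand-ins for the two selected datasets (values are made up)
hc_toy <- data.frame(
  land1 = c("L1", "L2"), ccf = c(0.1, 0.2), flag = c(FALSE, TRUE),
  stringsAsFactors = FALSE
)
ccf_toy <- data.frame(
  land1 = c("1", "2"), ccf = c(0.3, 0.4),
  stringsAsFactors = FALSE
)

# First class of each column, as a character vector
col_type <- function(df, cols) {
  vapply(df[cols], function(x) class(x)[1], character(1))
}

shared <- intersect(names(hc_toy), names(ccf_toy))
type_check <- data.frame(
  column   = shared,
  type_hc  = col_type(hc_toy, shared),
  type_ccf = col_type(ccf_toy, shared),
  row.names = NULL
)
type_check$agree <- type_check$type_hc == type_check$type_ccf
type_check

# Columns that exist only in the first data frame
only_in_hc <- setdiff(names(hc_toy), names(ccf_toy))
only_in_hc
```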

Matching Labels

The function below extracts the labels from the two datasets for a given variable name var and creates a plot to show the discrepancies between hamby_comparisons_select and the data passed as ccf_data.

plot_labels <- function(var, ccf_data) {
  
  # Extract the labels
  hc_labels <-
    unique(c(
      hamby_comparisons_select %>% pull(paste0(var, "1")),
      hamby_comparisons_select %>% pull(paste0(var, "2"))
    ))
  ccfwl_labels <-
    unique(c(
      ccf_data %>% pull(paste0(var, "1")),
      ccf_data %>% pull(paste0(var, "2"))
    ))

  # Plot the labels
  data.frame(data = c(
    rep("hamby comparisons", length(hc_labels)),
    rep("CCFs with lands", length(ccfwl_labels))
  ),
  labels = c(hc_labels, ccfwl_labels)) %>%
    ggplot(aes(x = labels, y = data)) +
    geom_tile() +
    labs(x = "", y = "", title = paste("Comparing labels in variable:", var))  
}

The plots below show that the following changes need to be made to CCFs_withlands_select in order to match hamby_comparisons_select:

  • the study label Hamby44 needs to be changed to Hamby173
  • lettered barrels need to be changed to BrUnk, and Br needs to be added to the other barrels
  • barrel letters need to become the bullet label, and B needs to be added to all bullets
  • L needs to be added to lands

plot_labels(var = "study", ccf_data = CCFs_withlands_select)

plot_labels(var = "barrel", ccf_data = CCFs_withlands_select)

plot_labels(var = "bullet", ccf_data = CCFs_withlands_select)

plot_labels(var = "land", ccf_data = CCFs_withlands_select)

Determine the letters used for the unknown barrels in CCFs_withlands_select:

all_barrel_labels <-
  unique(c(
    as.character(CCFs_withlands_select$barrel1),
    as.character(CCFs_withlands_select$barrel2)
  ))
letters <- all_barrel_labels[!(all_barrel_labels %in% 1:10)]
letters
##  [1] "B" "C" "D" "E" "F" "H" "J" "L" "M" "Q" "S" "U" "X" "Y" "Z" "A" "G" "I" "N"
## [20] "R" "V" "W"

Clean CCFs_withlands_select so the labels match hamby_comparisons_select:

CCFs_withlands_labelled <- 
  CCFs_withlands_select %>%
  mutate(
    study1 = ifelse(study1 == "Hamby44", "Hamby173", study1),
    study2 = ifelse(study2 == "Hamby44", "Hamby173", study2),
    bullet1 = ifelse(barrel1 %in% letters, as.character(barrel1), as.character(bullet1)),
    barrel1 = ifelse(barrel1 %in% letters, "Unk", barrel1),
    bullet2 = ifelse(barrel2 %in% letters, as.character(barrel2), as.character(bullet2)),
    barrel2 = ifelse(barrel2 %in% letters, "Unk", barrel2)
    ) %>%
  mutate(
    barrel1 = paste0("Br", barrel1),
    bullet1 = paste0("B", bullet1),
    land1 = paste0("L", land1),
    barrel2 = paste0("Br", barrel2),
    bullet2 = paste0("B", bullet2),
    land2 = paste0("L", land2)
  )
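
To make the relabelling rules concrete, the base R sketch below applies them to two made-up rows and then builds the land ID. The data frame toy_labels, the vector unknown_letters, and the bullet value in the first row are placeholders chosen for illustration, not values taken from CCFs_withlands.

```r
# Two made-up rows in the CCFs_withlands labelling scheme.
# The bullet value in the first row is a placeholder; it gets overwritten.
toy_labels <- data.frame(
  study1  = c("Hamby44", "Hamby252"),
  barrel1 = c("B", "10"),
  bullet1 = c("1", "2"),
  land1   = c("2", "4"),
  stringsAsFactors = FALSE
)

unknown_letters <- c("B", "Q")  # a subset of the barrel letters found above
is_unknown <- toy_labels$barrel1 %in% unknown_letters

# Rename the study, move the barrel letter into the bullet label,
# then mark the barrel as unknown
toy_labels$study1  <- ifelse(toy_labels$study1 == "Hamby44", "Hamby173", toy_labels$study1)
toy_labels$bullet1 <- ifelse(is_unknown, toy_labels$barrel1, toy_labels$bullet1)
toy_labels$barrel1 <- ifelse(is_unknown, "Unk", toy_labels$barrel1)

# Add the prefixes and build the land ID
toy_labels$land_id1 <- paste(
  toy_labels$study1,
  paste0("Br", toy_labels$barrel1),
  paste0("B", toy_labels$bullet1),
  paste0("L", toy_labels$land1),
  sep = "-"
)
toy_labels$land_id1
# [1] "Hamby173-BrUnk-BB-L2" "Hamby252-Br10-B2-L4"
```

The order of the updates matters: the bullet label has to be filled from the barrel letter before the barrel is overwritten with "Unk". The mutate() call above relies on the same ordering.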

The plots below show that the labels are now in agreement:

plot_labels(var = "study", ccf_data = CCFs_withlands_labelled)

plot_labels(var = "barrel", ccf_data = CCFs_withlands_labelled)

plot_labels(var = "bullet", ccf_data = CCFs_withlands_labelled)

plot_labels(var = "land", ccf_data = CCFs_withlands_labelled)

Final Touches

Create the land ids for both datasets:

hamby_comparisons <-
  hamby_comparisons_select %>%
  mutate(
    land_id1 = paste(study1, barrel1, bullet1, land1, sep = "-"),
    land_id2 = paste(study2, barrel2, bullet2, land2, sep = "-")
  ) %>%
  select(land_id1, land_id2, all_of(rtrees_features), same_source, flag)

CCFs_withlands <-
  CCFs_withlands_labelled %>%
  mutate(
    land_id1 = paste(study1, barrel1, bullet1, land1, sep = "-"),
    land_id2 = paste(study2, barrel2, bullet2, land2, sep = "-")
  ) %>%
  select(land_id1, land_id2, all_of(rtrees_features), same_source)

The structures of the cleaned data:

hamby_comparisons %>% str()
## 'data.frame':    84666 obs. of  13 variables:
##  $ land_id1   : chr  "Hamby173-Br10-B1-L1" "Hamby173-Br10-B1-L1" "Hamby173-Br10-B1-L1" "Hamby173-Br10-B1-L1" ...
##  $ land_id2   : chr  "Hamby173-Br10-B1-L2" "Hamby173-Br10-B1-L3" "Hamby173-Br10-B1-L4" "Hamby173-Br10-B1-L5" ...
##  $ ccf        : num  0.187 0.256 0.233 0.349 0.207 ...
##  $ rough_cor  : num  -0.0326 0.1787 0.0808 0.1682 0.0381 ...
##  $ D          : num  2.93 1.43 2.49 2.29 2.66 ...
##  $ sd_D       : num  2.64 1.88 2.57 2.68 2.57 ...
##  $ matches    : num  3.21 3.37 2.37 2.66 1.13 ...
##  $ mismatches : num  7.48 9.54 12.43 9.56 10.69 ...
##  $ cms        : num  1.6 1.68 2.37 1.59 1.13 ...
##  $ non_cms    : num  3.74 2.8 9.47 5.31 5.63 ...
##  $ sum_peaks  : num  4.21 3.63 2.3 3.92 0.45 ...
##  $ same_source: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
##  $ flag       : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
CCFs_withlands %>% str()
## 'data.frame':    83028 obs. of  12 variables:
##  $ land_id1   : chr  "Hamby252-Br10-B1-L1" "Hamby252-Br10-B1-L1" "Hamby252-Br10-B1-L1" "Hamby252-Br10-B1-L1" ...
##  $ land_id2   : chr  "Hamby252-Br10-B1-L2" "Hamby252-Br10-B1-L3" "Hamby252-Br10-B1-L4" "Hamby252-Br10-B1-L5" ...
##  $ ccf        : num  0.173 0.422 0.175 0.424 0.294 ...
##  $ rough_cor  : num  -0.58411 -0.41022 -0.16741 0.27833 -0.00905 ...
##  $ D          : num  0.00341 0.00332 0.00208 0.00187 0.00214 ...
##  $ sd_D       : num  0.00478 0.00479 0.00359 0.00329 0.00348 ...
##  $ matches    : num  1.09 1.22 2.19 4.09 5.41 ...
##  $ mismatches : num  6.62 9.15 5.42 7 6 ...
##  $ cms        : num  0.547 0.612 1.644 2.338 3.249 ...
##  $ non_cms    : num  4.07 8.13 4.43 2.8 1.5 ...
##  $ sum_peaks  : num  0.303 1.864 1.654 4.414 5.142 ...
##  $ same_source: logi  FALSE FALSE FALSE FALSE FALSE FALSE ...

Land ID Differences

This section considers the differences in the lands included in the two datasets. The code below extracts the unique land IDs from the two datasets:

hamby_comparisons_ids <- unique(c(hamby_comparisons$land_id1, hamby_comparisons$land_id2))
CCFs_withlands_ids <- unique(c(CCFs_withlands$land_id1, CCFs_withlands$land_id2))

Below is the number of unique land IDs contained in each of the datasets. There are fewer in CCFs_withlands than in hamby_comparisons:

length(hamby_comparisons_ids)
## [1] 412
length(CCFs_withlands_ids)
## [1] 408

Identify the land IDs in CCFs_withlands but not in hamby_comparisons:

CCFs_withlands_ids[!(CCFs_withlands_ids %in% hamby_comparisons_ids)]
## [1] "Hamby173-Br3-B2-L1"   "Hamby173-BrUnk-BM-L4"

Identify the land IDs in hamby_comparisons but not in CCFs_withlands:

hamby_comparisons_ids[!(hamby_comparisons_ids %in% CCFs_withlands_ids)]
## [1] "Hamby173-BrUnk-BE-L1" "Hamby173-BrUnk-BE-L2" "Hamby173-BrUnk-BE-L3"
## [4] "Hamby173-BrUnk-BE-L4" "Hamby173-BrUnk-BE-L5" "Hamby173-BrUnk-BE-L6"
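
The two lookups above are set differences, which base R also provides directly through setdiff(). A minimal sketch on made-up ID vectors (ids_a and ids_b are hypothetical names):

```r
# Made-up ID vectors standing in for the two sets of land IDs
ids_a <- c("Hamby173-Br3-B2-L1", "Hamby173-Br3-B2-L2", "Hamby173-BrUnk-BM-L4")
ids_b <- c("Hamby173-Br3-B2-L2", "Hamby173-BrUnk-BE-L1")

only_in_a <- setdiff(ids_a, ids_b)  # IDs in ids_a that are missing from ids_b
only_in_b <- setdiff(ids_b, ids_a)  # IDs in ids_b that are missing from ids_a

# For vectors without duplicates this matches the %in% idiom used above
same_as_in <- identical(only_in_a, ids_a[!(ids_a %in% ids_b)])

only_in_a
# [1] "Hamby173-Br3-B2-L1"   "Hamby173-BrUnk-BM-L4"
only_in_b
# [1] "Hamby173-BrUnk-BE-L1"
```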

Comparisons to rtrees

Dimensions

One of the observations that led us to realize that the hamby-comparisons data is not the training data for rtrees is that its number of observations does not agree with the number of predictions in the rtrees model. The number of observations used to train rtrees is the following:

length(bulletxtrctr::rtrees$predicted)
## [1] 83028

Here are the dimensions of the hamby_comparisons data, which do not agree with the number of observations used to train rtrees:

hamby_comparisons %>% dim()
## [1] 84666    13

Even after removing the observations known to have tank rash (flag != FALSE), the dimensions of the hamby_comparisons data do not agree with rtrees:

hamby_comparisons %>% filter(flag == FALSE) %>% dim()
## [1] 84255    13

Hare, Hofmann, and Carriquiry (2017) describe removing four land impressions that were flagged for quality assessment:

  • Barrel 6 Bullet 2-1
  • Barrel 9 Bullet 2-4
  • Unknown Bullet B-2
  • Unknown Bullet Q-4

These lands should correspond to Hamby 252. However, the same number of observations as used to train rtrees can be obtained by filtering hamby_comparisons on these barrel, bullet, and land combinations without specifying the study (that is, removing the matching observations from both Hamby 173 and Hamby 252).

hamby_comparisons_filtered <- 
  hamby_comparisons %>%
  mutate(bbl1 = str_remove(land_id1, pattern = "Hamby252-|Hamby173-"),
         bbl2 = str_remove(land_id2, pattern = "Hamby252-|Hamby173-")) %>%
  filter(!(bbl1 %in% c("Br6-B2-L1", "Br9-B2-L4", "BrUnk-BB-L2", "BrUnk-BQ-L4") | 
           bbl2 %in% c("Br6-B2-L1", "Br9-B2-L4", "BrUnk-BB-L2", "BrUnk-BQ-L4"))) %>%
  select(-bbl1, -bbl2)
hamby_comparisons_filtered %>% dim()
## [1] 83028    13
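
The difference between removing the flagged lands from both studies and removing them from Hamby 252 only can be seen on a small made-up example. In the sketch below, toy_pairs, strip_study, keep_any_study, and keep_252_only are hypothetical names, and the four rows are invented for illustration.

```r
# Four made-up comparisons; only the last one involves no flagged land
toy_pairs <- data.frame(
  land_id1 = c("Hamby173-Br6-B2-L1", "Hamby252-Br6-B2-L1",
               "Hamby252-Br1-B1-L1", "Hamby173-Br1-B1-L2"),
  land_id2 = c("Hamby173-Br1-B1-L1", "Hamby252-Br1-B1-L2",
               "Hamby252-BrUnk-BQ-L4", "Hamby173-Br1-B1-L3"),
  stringsAsFactors = FALSE
)

flagged <- c("Br6-B2-L1", "Br9-B2-L4", "BrUnk-BB-L2", "BrUnk-BQ-L4")

# Drop the study prefix so a land is matched regardless of study
strip_study <- function(id) sub("^Hamby(173|252)-", "", id)

# Variant 1: remove flagged barrel-bullet-land combinations from both studies
keep_any_study <- !(strip_study(toy_pairs$land_id1) %in% flagged |
                    strip_study(toy_pairs$land_id2) %in% flagged)

# Variant 2: remove the flagged lands from Hamby 252 only
flagged_252 <- paste0("Hamby252-", flagged)
keep_252_only <- !(toy_pairs$land_id1 %in% flagged_252 |
                   toy_pairs$land_id2 %in% flagged_252)

sum(keep_any_study)  # rows kept when the study is ignored
# [1] 1
sum(keep_252_only)   # rows kept when only Hamby 252 lands are removed
# [1] 2
```

The first variant mirrors the filter applied to hamby_comparisons above; the second is what a literal reading of the four flagged Hamby 252 lands would imply.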

While it is possible to filter hamby_comparisons down to the same number of observations as rtrees, how the bullets were actually removed is only a guess. CCFs_withlands, on the other hand, already has the same number of rows as rtrees without any filtering:

CCFs_withlands %>% dim()
## [1] 83028    12

Session Info

sessionInfo()
## R version 4.0.2 (2020-06-22)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] tidyr_1.1.2         stringr_1.4.0       randomForest_4.6-14
## [4] purrr_0.3.4         ggplot2_3.3.2.9000  dplyr_1.0.2        
## [7] cowplot_1.0.0      
## 
## loaded via a namespace (and not attached):
##  [1] rgl_0.100.54            Rcpp_1.0.5              locfit_1.5-9.4         
##  [4] mvtnorm_1.1-1           lattice_0.20-41         zoo_1.8-8              
##  [7] png_0.1-7               assertthat_0.2.1        digest_0.6.25          
## [10] mime_0.9                R6_2.4.1                imager_0.42.3          
## [13] tiff_0.1-5              bulletxtrctr_0.2.0      bulletcp_1.0.0         
## [16] evaluate_0.14           pillar_1.4.6            Rdpack_1.0.0           
## [19] rlang_0.4.7             curl_4.3                miniUI_0.1.1.1         
## [22] TTR_0.24.2              bmp_0.3                 rmarkdown_2.3          
## [25] labeling_0.3            webshot_0.5.2           readr_1.3.1            
## [28] x3ptools_0.0.2.9000     htmlwidgets_1.5.1       igraph_1.2.5           
## [31] munsell_0.5.0           shiny_1.5.0             compiler_4.0.2         
## [34] httpuv_1.5.4            xfun_0.17               pkgconfig_2.0.3        
## [37] htmltools_0.5.0         readbitmap_0.1.5        tidyselect_1.1.0       
## [40] tibble_3.0.3            crayon_1.3.4            withr_2.2.0            
## [43] later_1.1.0.1           MASS_7.3-51.6           grid_4.0.2             
## [46] jsonlite_1.7.1          xtable_1.8-4            gtable_0.3.0           
## [49] lifecycle_0.2.0         magrittr_1.5            scales_1.1.1           
## [52] bibtex_0.4.2.2          stringi_1.5.3           farver_2.0.3           
## [55] promises_1.1.1          smoother_1.1            xml2_1.3.2             
## [58] xts_0.12.1              ellipsis_0.3.1          generics_0.0.2         
## [61] vctrs_0.3.4             tools_4.0.2             manipulateWidget_0.10.1
## [64] glue_1.4.2              hms_0.5.3               crosstalk_1.1.0.1      
## [67] jpeg_0.1-8.1            fastmap_1.0.1           yaml_2.2.1             
## [70] colorspace_1.4-1        gbRd_0.4-11             grooveFinder_0.0.1     
## [73] knitr_1.29